ABSTRACT



Sunspot activity is highly variable and challenging to forecast. Yet forecasts
are important, since peak activity has profound effects on major geophysical
phenomena including space weather (satellite drag, telecommunications outages)
and has even been correlated speculatively with changes in global weather
patterns. This paper investigates trends in sunspot activity, using new techniques
for decadal-scale prediction of the present solar cycle (cycle 24). First, Hurst
exponent H analysis is used to investigate the autocorrelation structure of the
putative dynamics; then the Sugihara-May algorithm is used to predict the
ascension time and the maximum intensity of the current sunspot cycle. Here we
report H = 0.86 for the complete sunspot number dataset (1700-2007) and
H = 0.88 for the reliable sunspot data set (1848-2007). Using the Sugihara-May
algorithm analysis, we forecast that cycle 24 will reach its maximum in December
2012 at approximately 87 sunspot units.

Subject headings: (Sun:) sunspots and activity — methods: data analysis and 
statistical 
1. Introduction



Solar radiation is far from constant. These changes can be seen in many solar activity
indicators, such as sunspot number, sunspot area, total solar irradiance, solar flares, the
recurrence index of geomagnetic disturbances (Kane 2001), solar cycle length (Kane 2008),
and dynamo models (Hiremath 2008). However, sunspot number is the most common
solar activity indicator, having been recorded since 1700, and used as an indicator for solar
cycle prediction since 1913 (Kimura 1913; Currie 1973; DeMeyer 1981; Kane 2007).



Any change in solar activity presents challenges in solar physics (understanding solar
cycle mechanisms, prediction of solar events such as flares, CMEs, etc.) or in the field of space
weather (satellite drag, telecommunication outages, etc.). Therefore solar cycle prediction
is of vital importance now, and will be even more important in the future. Despite general
knowledge of solar cycles, reliable forecasting of sunspot number remains problematic.
Several statistical methods have already been applied, but due to the nonlinear nature of
the time series, linear approaches are expected to fail. Other techniques have been employed
(e.g. precursors; Kane 2007) that offer theoretical advantages. Because of the wide
variety of techniques used, the present solar cycle (Cycle 24) has been predicted to have a
maximum sunspot number as low as 50 or as high as 180 (Obridko & Shelting 2008). In
this study, we predict the length and intensity of cycle 24 using two statistical/dynamical
methods: the Hurst exponent and the Sugihara-May algorithm.



2. Data and Methods 



The monthly ISSN (International SunSpot Number) data were used from 1847 to 2007.
Though data are available back to 1700, according to Kane (2008) "the quality of the data
is considered as poor during 1700 - 1748, questionable during 1749 - 1817, good during 1818
- 1847, and reliable since 1848." The maximum intensity of each cycle and the time from
the beginning of the cycle to its maximum (hereafter "ascension time") were derived for
cycles 1 (February 1755) to 23 (March 2000). All the datasets used in this study were taken
from the National Geophysical Data Center (NGDC)^1. Because standard methods are expected
to fail to make accurate predictions, we apply two unusual techniques to these data to make
our predictions.

(1) The Hurst Exponent or Rescaled Range Analysis. This method was proposed by
Hurst (1951) for an experimental study of long term information storage in time series
data. This technique has had wide application in many research fields, including finance
(Grech & Mazur 2004; Qian & Rasheed 2004), astronomy (Komm 1995; Rozelot 1995, 2008;
Ruzmaikin et al. 1994), climate (Rangarajan & Sant 2004), and others. Hurst
analysis is a simple and robust way to analyze randomness in a dataset. The parameter
H measures the persistence of structures in the time series, indicating whether the data
represent a pure random walk or have underlying trends. Another way to state this is that
a random process with an underlying trend has some degree of autocorrelation. When the
autocorrelation has a very long (or mathematically infinite) decay, the process is referred to
as a long memory process. The value of H varies from 0 (indicating anti-persistent brown
noise) through 0.5 (random white noise) to 1.0 (indicating a strong, smooth trend). Hurst found
that the rescaled range series (R/S) over a time window of width t is described as a power
law:

$$(R/S)_t = c^{*}\, t^{H}, \qquad (1)$$



where c* is a constant and H is the Hurst exponent. To estimate the value of the Hurst
exponent, R/S is plotted versus t on log-log axes. The slope of the linear regression gives
the value of the Hurst exponent. For more details about R/S analysis, see
Qian & Rasheed (2004).

^1 http://www.ngdc.noaa.gov/stp/SOLAR/ftpsunspotnumber.html
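
The R/S procedure described under (1) can be illustrated with a minimal Python sketch. It is not the code used in this study: the function names, the doubling sequence of window widths, and the choice of non-overlapping windows are all assumptions made for illustration. For each width t it averages R/S over the windows and then fits the log-log slope by ordinary least squares, which is the estimate of H in Eq. (1).

```python
import math

def rescaled_range(window):
    """R/S of one window: range of cumulative mean-deviations over the std."""
    n = len(window)
    mean = sum(window) / n
    running = lowest = highest = 0.0
    for value in window:
        running += value - mean
        lowest = min(lowest, running)
        highest = max(highest, running)
    std = math.sqrt(sum((value - mean) ** 2 for value in window) / n)
    return (highest - lowest) / std if std > 0 else None

def hurst_exponent(series, min_width=8):
    """Slope of log(R/S) versus log(t); widths t double from min_width (assumed)."""
    log_t, log_rs = [], []
    width = min_width
    while width <= len(series) // 2:
        ratios = []
        # Non-overlapping windows of the current width (an illustrative choice).
        for start in range(0, len(series) - width + 1, width):
            ratio = rescaled_range(series[start:start + width])
            if ratio is not None:
                ratios.append(ratio)
        if ratios:
            log_t.append(math.log(width))
            log_rs.append(math.log(sum(ratios) / len(ratios)))
        width *= 2
    # Ordinary least-squares slope of log(R/S) on log(t).
    mean_t = sum(log_t) / len(log_t)
    mean_rs = sum(log_rs) / len(log_rs)
    num = sum((a - mean_t) * (b - mean_rs) for a, b in zip(log_t, log_rs))
    den = sum((a - mean_t) ** 2 for a in log_t)
    return num / den
```

Calling `hurst_exponent(values)` on a list of monthly sunspot numbers would return a single slope. The series must be long enough to give at least two window widths, and for short series R/S estimates are known to be biased upward.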

(2) The Sugihara-May Algorithm. This algorithm compares a library of (known)
past patterns to patterns seen later in the real time series. It does this by reconstructing
an attractor from the library, locating the "present" point on the attractor, and tracing
that point forward along the attractor's trajectory. The dimension of the attractor is
determined by the complexity of the process that generates it, and the number of points
involved in tracing forward determines the nonlinearity of the system. A similar chaos
technique was first applied to the historical sunspot data by Kurths & Ruzmaikin (1990)
to determine the nonlinearity of the data set and to predict the following solar cycle.



For a chaotic time series, the accuracy of a nonlinear forecast falls off as prediction time
increases (Sugihara & May 1990). This is a two step procedure. First, simplex projection
identifies the best embedding dimension, which is then used in an S-map procedure to check
the nonlinearity (Sugihara & May 1990; Sugihara 1994). A time series X of length N
is embedded in D-dimensional Euclidean space to create a "landscape" x of $N - D - 1$
vectors, where $x_t = (X_{t-n+1}, X_{t-n+2}, \ldots, X_t)$. The first n of these $x_t$ vectors
are associated with output values $X_{t+T_p}$. Then forecasts of the $(N - n)$ remaining input
vectors, i.e. predictions, are made by



$$y_{t+T_p} = \frac{\sum_{k=1}^{D+1} X_{k+T_p} \exp(-d_k)}{\sum_{k=1}^{D+1} \exp(-d_k)} \qquad (2)$$

$$d_k = \| y_t - x_k \| \qquad (3)$$



where the summation is taken over the D + 1 closest neighbors in D-dimensional Euclidean
space. Ideally, these closest neighbors form the vertices of the smallest simplex containing
the predictee. In this context, this predictive approach is a local approximation similar to
kernel density estimation. To find an appropriate value of D, various values of D are applied
to find the value that minimizes prediction error (i.e., an embedding that minimizes the
singularities or indeterminate crossing points of trajectories in the putative attractor). The
optimal D is then used in an S-map to build weighted (local) linear predictions. The rate
of decay of the weight given to each point is set by the parameter $\theta$, which describes the
degree of chaoticity of a data set. A forecast improvement with local weighting indicates that
the underlying dynamics are nonlinear. Hence $\theta = 0$ performs best for linear time series
(i.e., $\theta = 0$ is the global linear solution), and when $\theta > 0$ performs best, the time
series is nonlinear (Miyano et al. 2000).
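
The simplex-projection forecast of Eqs. (2) and (3) can be sketched in a few lines of Python. This is an illustrative sketch rather than the implementation used here: the function names and the `library_end` argument (the index separating the library from the points to be predicted) are assumptions. The smallest distance is subtracted inside the exponential, which leaves the ratio in Eq. (2) unchanged but avoids numerical underflow when distances are large.

```python
import math

def embed(series, dim):
    """Lag vectors x_t = (X_{t-dim+1}, ..., X_t), each paired with its index t."""
    return [(t, tuple(series[t - dim + 1:t + 1]))
            for t in range(dim - 1, len(series))]

def simplex_forecast(series, dim, tp, library_end):
    """Forecast X_{t+tp} for every t >= library_end from the dim+1 nearest
    library vectors, weighted by exp(-distance) as in Eqs. (2) and (3)."""
    vectors = embed(series, dim)
    # Library: vectors whose tp-step-ahead value also lies inside the library.
    library = [(t, v) for t, v in vectors if t + tp < library_end]
    targets = [(t, v) for t, v in vectors if t >= library_end]
    forecasts = []
    for t, y in targets:
        nearest = sorted((math.dist(y, v), s) for s, v in library)[:dim + 1]
        d_min = nearest[0][0]
        # Subtracting d_min rescales all weights equally, so the ratio is unchanged.
        weights = [math.exp(-(d - d_min)) for d, _ in nearest]
        weighted_sum = sum(w * series[s + tp]
                           for w, (_, s) in zip(weights, nearest))
        forecasts.append((t + tp, weighted_sum / sum(weights)))
    return forecasts
```

Repeating the call for several values of `dim` and comparing the forecasts with the held-back observations gives the kind of error-versus-dimension scan described above. The library must contain at least one vector, otherwise the neighbor search has nothing to return.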



We investigate the significance of the historical relationship between maximum sunspot 
number and ascension time with Pearson's test. 



3. Analysis and Results 



Hurst exponent analysis was applied to the monthly ISSN data separated in two time
intervals: 1700 to 2007 and 1848 to 2007. H = 0.86±0.008 and H = 0.88±0.009 for these
datasets respectively (Fig. 1). The bulge between 6 and 8 in panels (a) and (b) of Figure 1
is due to the main contribution of the 11-year cycle (Ruzmaikin et al. 1994).



To investigate the complexity, nonlinearity and predictability of the data, the 
Sugihara-May algorithm was applied as described above. From this analysis, it was 
discovered that the data are high-dimensional, i.e. governed by many orthogonal processes 
such as solar rotation and inner magnetic activity (Fig. 2). 

In the second step of this procedure, the S-Map analysis (Fig. 3), a weak nonlinear 
signature was detected in 8 dimensions, indicating weak chaotic behavior. 
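
The S-map step can be sketched as a locally weighted linear regression. The sketch below is hypothetical and not the code used for Fig. 3. It assumes the commonly used weighting exp(-theta d / d_mean), where d_mean is the mean distance from the query point to the library vectors; the text above states only that the parameter sets the rate of decay of the weights. The weights are applied to the squared residuals, and the small linear solver is included only to keep the example self-contained.

```python
import math

def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x

def smap_predict(library, next_values, query, theta):
    """Weighted linear forecast for `query`; theta = 0 gives one global linear fit.

    library     : lag vectors (tuples) with known futures
    next_values : the value Tp steps after each library vector
    """
    dists = [math.dist(query, v) for v in library]
    scale = sum(dists) / len(dists)
    # Assumed weighting: exp(-theta * d / mean distance).
    weights = [math.exp(-theta * d / scale) if scale > 0 else 1.0 for d in dists]
    k = len(query) + 1  # intercept plus one coefficient per coordinate
    ata = [[0.0] * k for _ in range(k)]
    atb = [0.0] * k
    for w, v, y in zip(weights, library, next_values):
        row = (1.0,) + tuple(v)
        for i in range(k):
            atb[i] += w * row[i] * y
            for j in range(k):
                ata[i][j] += w * row[i] * row[j]
    coeffs = solve(ata, atb)
    return coeffs[0] + sum(c * q for c, q in zip(coeffs[1:], query))
```

Scanning `theta` upward from zero and recording the out-of-sample forecast skill would produce a curve of the type shown in Fig. 3: an optimum at `theta = 0` points to linear dynamics, an optimum at `theta > 0` to nonlinear dynamics. The solver fails on a singular system, so the library vectors must not be degenerate.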

We also tested the prediction capability of our method by predicting the second half of
the given data set (1927-2008) using only data from 1848-1926. The model produced very
skillful predictions, showing a correlation of 0.94 with the observed values (Fig. 4). We
therefore attempted to perform long-range forecasts by sequentially increasing $T_p$ (the time
length into the future that one tries to predict from the last data point). These results are
mostly an echo of the previous cycle. This technique predicts that the current solar cycle
will reach a maximum in December 2012, peaking at a sunspot number of 87.4 (Fig. 5).



A similar method to our prediction analysis was used by Kurths & Ruzmaikin (1990)
for solar cycle 22. They found that the maximum sunspot number would be about 150.
Because the observed maximum of that cycle was 158, a rather good prediction estimate, it
instills more confidence in the technique.

For historical cycles, we further investigated the link between the maximum sunspot 
number and the ascension time. We found that the two variables have a strong negative
correlation (r = -0.82, df = 21, p < 0.001; Fig. 6). Our prediction above (87.4 sunspots in 5.1
years from December 2007) is plausible given the historical relationship between the two 
values, as cycles with an ascending phase between 4.5 and 5.5 years have a maximum in the 
64.2 to 131.6 sunspot range. 
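
The significance quoted for this correlation can be checked with the standard t statistic for a Pearson coefficient, t = r sqrt(df / (1 - r^2)), where df = n - 2, so that df = 21 corresponds to the 23 historical cycles. The short sketch below is illustrative only: the function names are assumptions, and the sign of r is taken as negative, following the description in the text.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def t_statistic(r, df):
    """t statistic for testing r against zero with df = n - 2 degrees of freedom."""
    return r * math.sqrt(df / (1.0 - r * r))

# Values quoted in the text: |r| = 0.82 with df = 21.
t_value = t_statistic(-0.82, 21)
```

With these inputs the magnitude of `t_value` is about 6.6, well above the two-tailed critical value of roughly 3.8 for p = 0.001 at 21 degrees of freedom, which is consistent with the reported p < 0.001.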



4. Discussion 



Our results show that the Hurst exponents of the ISSN data for the periods
1700-2007 and 1848-2007 are 0.86±0.008 and 0.88±0.009 respectively. This is in agreement
with Mandelbrot & Wallis (1969), who first applied this type of analysis to the monthly
sunspot number, and found H = 0.86. Similarly, Ruzmaikin et al. (1994) analyzed the
14C radiocarbon data as a proxy of solar activity, and found H = 0.84. In analyzing daily
Doppler solar differential rotation coefficients A and B measured at Mount Wilson, USA,
Komm (1995) found H to be 0.83 and 0.86 respectively. The slightly higher exponent
obtained for the recent dataset indicates that more reliable sunspot data exhibit a
slightly stronger autocorrelation (or tendency to trend), and thereby have more statistical
predictability than the full data set (1700-2007). Thus it is likely that errors or uncertainty
within the unreliable data affected its inherent predictability.

Our prediction of ascension time can be compared with values already available in the
literature, which range from December 2009 to 2014 (Maris & Oncica 2006; Tsirulnik et al.
1997). Obridko & Shelting (2008) reported that the second half of 2010 or the first quarter of
2011 would be the most reliable estimates for the maximum of cycle 24. Kane (2008)
suggested October 2011 or August of 2012, while Maris & Oncica (2006) found that cycle
maximum will occur as early as December 2009.



Our prediction of Cycle 24's intensity also bears comparison with other forecasts. As
early as 1983, Chistyakov (1983) claimed that cycle 23 and 24 would be low and cycle 24
would be lower than cycle 23. Duhau (2003) predicted a maximum of 87.5±23.5, and
Wang et al. (2002) estimated a peak between 83.2 and 119.4. Maris et al. (2004) reported
that cycle 24 has to be low by analyzing the number of flares. Javaraiah (2007) obtained
a value of 74 from the sunspot group data and Hiremath (2007) obtained a value of 116
by using a harmonic oscillator solar cycle model. Kane (2007) predicted that sunspot
number would be 129.7±16.3 for the present Cycle, but later (2008) revised this estimate to
either 140±20 in October 2011 or 90±10 in August 2012. It must be emphasized that the
observational result for the start of cycle 24 (April 2008) is in rather good agreement with
Kane's prediction. Thus, Kane's estimates for the time at which cycle 24 will reach its
maximum could be reliable. Using a neural-network prediction, Maris & Oncica (2006)
predicted 145 sunspots at the peak in December 2009.



The precursor models, which use data from the declining phase of cycle N - 1 to
predict height and timing of cycle N, were used by Obridko & Shelting (2008). They
reported that "the precursor models based on the polar field or Hα data often yield lower
values of the current cycle", which vary from 70±10 to 120±40. From the dynamo model,
Choudhuri et al. (2007) have reported that cycle 24 would be 35% lower than cycle 23.
Dikpati & Gilman (2006) disagreed: using a flux transport dynamo model, they argued that
cycle 24 will be 30 - 50% higher than cycle 23. Hathaway & Wilson (2006) give a value of
160 based on geomagnetic activity at the minimum. From the index of the global magnetic
field, Obridko & Shelting (2008) forecast that cycle 24 would be of "medium high, the
same or somewhat higher than cycle 23". Finally, Kitiashvili & Kosovichev (2008), using
a nonlinear dynamo model as described by Kleeorin & Ruzmaikin (2008) which takes into
account dynamics of the turbulent magnetic helicity, predict that the next sunspot cycle
will be significantly weaker (by ~ 30%) than the previous cycle, continuing the trend of low
solar activity.

By means of the two methods utilized here, the Rescaled Range Analysis and the
Sugihara-May algorithm, we were able to deduce from the significant trends of the ISSN
data both the cycle duration and its maximum intensity. Unlike previous analyses, our
forecasting results are based on models that were tested out of sample and shown to have a
high degree of forecast skill. Our conclusions therefore are (i) the reliable monthly mean
sunspot data during 1848-2007 (Kane 2008) yield a slightly higher Hurst exponent than do all
the historical observational data; (ii) H, being greater than 0.5, shows that the sunspot
series is highly persistent (exhibits momentum or trending); (iii) concerning cycle 24, the
maximum intensity of 87.4 will be reached in December 2012; (iv) according to this forecast,
the current solar cycle will have a magnitude far lower than any other since 1890-1910.



All data sets used in this study are taken from the NGDC web page. One of the authors
(A. K.) is very thankful to the SWYA School organizing committee for providing financial
support and to the lecturers for their valuable comments. This work, which is a small part
of the PhD thesis of A. K., was supported by the Scientific and Technical Council of Turkey
under project 107T878.
Fig. 1. Hurst exponent analysis results deduced from monthly ISSN data sets; a) from
1700 to 2007, b) from 1848 to 2007. The slight departure from linearity is a signature of the 
cyclicity. 
[Figure 2: two panels, vertical axis "Complexity", horizontal axis "Embedding Dimension"]

Fig. 2. Complexity of the ISSN data. These figures show that the best embedding
dimension is moderate (D=5-8).
Fig. 3. For the best embedding dimension (D=8) as deduced from Fig. 2, the data set
is most predictable when using a nonlinearity tuning parameter $\theta > 0$, of about 0.18. For
linear data sets this value is equal to zero. Qualitatively similar results are obtained with D=5.
Fig. 4. (a) The correlation / error between predicted and observed data; (b) comparison of
observed and predicted ISSN values for the data period used (1848-2007).
Fig. 5. Monthly ISSN data predictions using the Sugihara-May algorithm. The first
value is taken as 1918, which corresponds to January 2008.
[Figure 6: horizontal axis "Cycle Number"]

Fig. 6. Comparisons of sunspot numbers and ascension time of historical cycles (r = -0.82,
df = 21, p < 0.001). The two last values describe our prediction.



